Nosocomiicoccus massiliensis strain NP2T sp. nov. is the type strain of a new species within the genus
Nosocomiicoccus. This strain, whose genome is described here, was isolated from the fecal flora of
an AIDS-infected patient living in Marseille, France. N. massiliensis is a Gram-positive aerobic
coccus. Here we describe the features of this organism, together with the complete genome
sequence and annotation. The 1,645,244 bp long genome (one chromosome but no plasmid) contains
1,738 protein-coding and 45 RNA genes, including 3 rRNA genes.



Introduction 

Nosocomiicoccus massiliensis strain NP2T (= CSUR
P246 = DSM 26222) is the type strain of N.
massiliensis sp. nov. This bacterium is a Gram-positive,
non-spore-forming, indole-negative, aerobic
and motile coccus that was isolated from the
stool of an AIDS-infected patient living in Marseille
(France) as part of a "culturomics" study aiming
at cultivating all species within human feces [1,2].

The current prokaryote species classification, 
known as polyphasic taxonomy, is based on a com- 
bination of genomic and phenotypic properties [3]. 
With each passing year, the number of completely 
sequenced genomes increases geometrically while 
the cost of such techniques decreases. More than 
4,000 bacterial genomes have been published and 
approximately 15,000 genome projects are antici- 
pated to be completed in the near future [4]. We 
recently proposed to integrate genomic infor- 
mation in the taxonomic framework and descrip- 
tion of new bacterial species [5-22]. 

Here we present a summary classification and a set
of features for N. massiliensis sp. nov. strain NP2T (=
CSUR P246 = DSM 26222), together with the description
of the complete genomic sequence and its
annotation. These characteristics support the
circumscription of the species N. massiliensis. The
genus Nosocomiicoccus Alves et al. 2008 was created
on the basis of 16S rRNA gene sequence and
phenotypic analyses within the family
Staphylococcaceae [23]. To date, this genus comprises
a single species, N. ampullae, which was
isolated from the surface of saline bottles used for
washing wounds in hospital wards [23].

Classification and features 

A stool sample was collected from an HIV-infected
patient living in Marseille (France). The patient
gave informed and signed consent. This study
and the assent procedure were approved by the
ethics committee of the IFR48 (Marseille, France)
under reference 09-022. The fecal specimen was
preserved at -80°C after collection. Strain NP2T
(Table 1) was isolated in January 2012 by aerobic
cultivation on 5% sheep blood agar (BioMerieux,
Marcy l'Etoile, France) at 37°C, after 14 days of
preincubation of the stool sample in a blood culture
bottle supplemented with 5 ml of sterile ovine rumen
fluid. This strain exhibited 97% 16S rRNA nucleotide
sequence similarity with N. ampullae [23] and
92-94% nucleotide sequence similarity to
the most closely related members of the genus
Jeotgalicoccus [34] (Figure 1). These values were
lower than the 98.7% 16S rRNA gene sequence
threshold recommended by Stackebrandt and
Ebers to delineate a new species without carrying
out DNA-DNA hybridization [35].
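The threshold comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the identity is computed over ungapped columns of an already aligned pair of sequences (the text does not say how gaps were treated), and the only figure taken from the text is the 98.7% threshold of Stackebrandt and Ebers [35].

```python
# Minimal sketch (assumed helper names, not the authors' pipeline).
SPECIES_THRESHOLD = 98.7  # percent 16S rRNA similarity, from the text [35]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def below_species_threshold(similarity: float,
                            threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when the similarity falls below the species delineation threshold."""
    return similarity < threshold


if __name__ == "__main__":
    # 97% is the similarity to N. ampullae reported in the text.
    print(below_species_threshold(97.0))
```

With the 97% similarity to N. ampullae reported in the text, the check returns True, which matches the argument that the isolate falls below the species threshold.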
Table 1. Classification and general features of Nosocomiicoccus massiliensis strain NP2T

MIGS ID    Property                 Term                                   Evidence code a

           Current classification   Domain Bacteria                        TAS [24]
                                    Phylum Firmicutes                      TAS [25-27]
                                    Class Bacilli                          TAS [28,29]
                                    Order Bacillales                       TAS [30,31]
                                    Family Staphylococcaceae               TAS [28,32]
                                    Genus Nosocomiicoccus                  TAS [23]
                                    Species Nosocomiicoccus massiliensis   IDA
                                    Type strain NP2T
           Gram stain               Positive                               IDA
           Cell shape               Cocci                                  IDA
           Motility                 Motile                                 IDA
           Sporulation              Nonsporulating                         IDA
           Temperature range        Mesophile                              IDA
           Optimum temperature      37°C                                   IDA
MIGS-6.3   Salinity                 Unknown                                IDA
MIGS-22    Oxygen requirement       Aerobic                                IDA
           Carbon source            Unknown                                NAS
           Energy source            Unknown                                NAS
MIGS-6     Habitat                  Human gut                              IDA
MIGS-15    Biotic relationship      Free living                            IDA
MIGS-14    Pathogenicity            Unknown
           Biosafety level          2
           Isolation                Human feces
MIGS-4     Geographic location      France                                 IDA
MIGS-5     Sample collection time   January 2012                           IDA
MIGS-4.1   Latitude                 43.296482                              IDA
MIGS-4.2   Longitude                5.36978
MIGS-4.3   Depth                    Surface                                IDA
MIGS-4.4   Altitude                 0 m above sea level                    IDA

a Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report
exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living,
isolated sample, but based on a generally accepted property for the species, or anecdotal evidence).
These evidence codes are from the Gene Ontology project [33]. If the evidence is IDA, then the property
was directly observed for a live isolate by one of the authors or an expert mentioned in the
acknowledgements.






Standards in Genomic Sciences 



Mishra et al.



[Figure 1: phylogenetic tree. Taxa shown, with GenBank accession numbers: Jeotgalicoccus halotolerans (AY028925),
Jeotgalicoccus psychrophilus (AY028926), Jeotgalicoccus marinus (EU583727), Nosocomiicoccus ampullae (EU240886),
Nosocomiicoccus massiliensis (JX424771), Salinicoccus roseus (X94559), Salinicoccus siamensis (AB258358),
Staphylococcus pseudintermedius (AJ780976), Staphylococcus caprae (AB009935), Staphylococcus hominis (X66101),
Staphylococcus lugdunensis (AB009941), Marinococcus halophilus (X90835)]
Figure 1. Phylogenetic tree highlighting the position of Nosocomiicoccus massiliensis strain NP2T relative to a
selection of type strains of validly published species within the Staphylococcaceae family. GenBank accession
numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were
obtained using the maximum-likelihood method within the MEGA program. Numbers at the nodes are percentages of
bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree.
Marinococcus halophilus was used as the outgroup. The scale bar represents a 2% nucleotide sequence
divergence.
Figure 2. Gram staining of N. massiliensis strain NP2T



Figure 3. Transmission electron microscopy of N. massiliensis strain NP2T, using a Morgagni
268D (Philips) at an operating voltage of 60 kV. The scale bar represents 900 nm.
Different growth temperatures (25, 30, 37, 45°C)
were tested. Growth was observed between 25 and
45°C, with optimal growth at 37°C after 24 hours of
incubation. Colonies were 1 mm in diameter on
blood-enriched Columbia agar. Growth of the strain
was tested on 5% sheep blood agar under anaerobic
and microaerophilic conditions using the GENbag anaer
and GENbag microaer systems, respectively
(BioMerieux), and under aerobic conditions, with or
without 5% CO2. Optimal growth was obtained
aerobically; weak growth was observed under
microaerophilic conditions, and no growth was observed
under anaerobic conditions. Gram staining showed
Gram-positive cocci (Figure 2). The motility test was
positive. Cells grown on agar have a mean diameter of
0.72 µm as determined by electron microscopy (Figure 3).

Strain NP2T exhibited catalase but no oxidase activity.
Using an API 20NE strip (BioMerieux, Marcy
l'Etoile), negative reactions were obtained for nitrate
reduction, urease, indole production, glucose
fermentation, arginine dihydrolase, β-galactosidase,
glucose, arabinose, mannose, mannitol,
N-acetyl-glucosamine, maltose, gluconate, caprate, adipate,
malate, citrate, phenyl-acetate and cytochrome
oxidase. Substrate oxidation and assimilation were
examined with an API 50CH strip (BioMerieux) at the
optimal growth temperature, but neither sugar fermentation
nor assimilation was observed. N.
massiliensis strain NP2T was susceptible to amoxicillin,
imipenem, rifampicin, vancomycin, doxycycline
and gentamicin but resistant to
trimethoprim/sulfamethoxazole, metronidazole and
ciprofloxacin. When compared with representative
species from the family Staphylococcaceae, N.
massiliensis strain NP2T exhibited the phenotypic
differences detailed in Table 2.

Matrix-assisted laser-desorption/ionization time-of-flight
(MALDI-TOF) MS protein analysis was carried
out as previously described [36] using a Microflex
spectrometer (Bruker Daltonics, Leipzig, Germany).
Twelve individual colonies were deposited on an MTP
384 MALDI-TOF target plate (Bruker). The twelve
NP2T spectra were imported into the MALDI
BioTyper software (version 2.0, Bruker) and
analyzed by standard pattern matching (with default
parameter settings) against the main spectra of
4,706 bacteria, including spectra from one validly
published species of Nosocomiicoccus, used as reference
data in the BioTyper database. A score enabled the
presumptive identification and discrimination of the
tested species from those in the database: a score > 2
with a validly published species enabled identification
at the species level, and a score < 1.7 did not
enable any identification. For strain NP2T, no significant
score was obtained, suggesting that our isolate
was not a member of any known species (Figures 4
and 5).
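The score interpretation stated above can be written as a small decision function. This is a hedged sketch: the function name is hypothetical, only the two cut-offs given in the text (> 2 and < 1.7) are encoded, and scores between 1.7 and 2 are reported as unspecified because the text does not say how they are interpreted.

```python
# Minimal sketch of the two score rules stated in the text (assumed function name).
def interpret_biotyper_score(score: float) -> str:
    """Map a BioTyper pattern-matching score to the interpretation given in the text."""
    if score > 2.0:
        return "identification at the species level"
    if score < 1.7:
        return "no identification"
    # The text gives no rule for scores from 1.7 to 2.
    return "not specified in the text"


if __name__ == "__main__":
    # Illustrative scores only; they are not measured values from the study.
    for example_score in (2.3, 1.9, 1.2):
        print(example_score, interpret_biotyper_score(example_score))
```

Under this reading, a set of twelve spectra whose best scores all fall below 1.7 would be reported as unidentified, which is the situation described for strain NP2T.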

Genome sequencing information 
Genome project history 

The organism was selected for sequencing on the
basis of its phylogenetic position and 16S rRNA
similarity to other members of the genus
Nosocomiicoccus, and is part of a "culturomics"
study of the human digestive flora aiming at isolating
all bacterial species within human feces. It was
the first genome of a Nosocomiicoccus species and
the first genome of Nosocomiicoccus massiliensis sp.
nov. The genome sequence, which consists of 154 contigs,
is deposited in GenBank under accession number
CAVG00000000. Table 3 summarizes the project
information and its association
with MIGS version 2.0 compliance [37].

Growth conditions and DNA isolation 

N. massiliensis sp. nov. strain NP2T (= CSUR P246 =
DSM 26222) was grown aerobically on M17 agar
medium at 37°C. The growth from five Petri dishes was
collected and resuspended in 3 × 100 µl of G2 buffer
(EZ1 DNA Tissue kit, Qiagen). A first mechanical lysis
was performed with glass powder on the Fastprep-24 device
(Sample Preparation system, MP Biomedicals, USA)
for 2 × 20 seconds. DNA was treated with 2.5 µg/µL
of lysozyme (30 minutes at 37°C) and extracted
using the BioRobot EZ1 Advanced XL (Qiagen). The
DNA was concentrated and purified on a QIAamp kit
(Qiagen). The DNA concentration
was 69.3 ng/µl as measured using the Quant-it
Picogreen kit (Invitrogen) on the Genios Tecan
fluorometer.

Genome sequencing and assembly 

DNA (5 µg) was mechanically fragmented for the
paired-end sequencing, using a Covaris device
(Covaris Inc., Woburn, MA, USA) with an enrichment
size of 3-4 kb. The DNA fragmentation was
visualized through an Agilent 2100 BioAnalyzer on
a DNA Labchip 7500, which yielded an optimal size
of 3.4 kb. The library was constructed using a 454
GS FLX Titanium paired-end rapid library protocol.
Circularization and nebulization were performed
and a pattern of optimal size of 589 bp was generated.
PCR amplification was performed for 17 cycles
followed by double size selection. The single-stranded
paired-end library was quantified using a
Quant-it Ribogreen Kit (Invitrogen) using a Genios
Tecan fluorometer. The library concentration
equivalence was calculated as 1.42 × 10^10 molecules/µL.
The library was stored at -20°C until further use.



Table 2. Differential characteristics of Nosocomiicoccus species* 



Properties    N. massiliensis    N. ampullae    J. psychrophilus    M. caseolyticus    S. pseudointermedius    S. albus


Cell diameter (µm)
Oxygen requirement 


0.72 na 
aerobic aerobic 


0.6-1.1 
anaerobic 


1.1-2 
Facultative 
anaerobic 


1.0-1.5 
aerobic 


1.0-2.0 
aerobic 


Pigment production 


+ + 


+ 


+ 


- 


- 


Gram stain 


+ + 


+ 


+ 


+ 


+ 


Salt requirement 


+ 


+ 


+ 


na 


+ 


Motility 

Peptidoglycan type 


+ 

t-Lys-GIV/- 
t-Ser(Gly) 


_ 

L-Lys-GIV} -L- 
Ala(Gly) 


_ 

t-tys-Gly 3 .-t- 
Ser-teichoic ac- 

irl 

la 


_ 
na 


_ 

L-Lys-Gly 5 


Endospore formation 












Production of 












Acid phosphatase 


- 


+ 


+ 


+ 


+ 


Catalase 


+ + 


+ 


+ 


+ 


+ 


Oxidase 


+ 


+ 


+ 


- 


+ 


Nitrate reductase 


- 


- 


+ 


+ 


+ 


Urease 


- 


- 


- 


+ 


+ 


β-galactosidase


- 


na 


- 


+ 


- 


N-acetyl-glucosamine 


- 


na 


na 


+ 


- 


Acid from 












L-Arabinose


- w 




+ 




+ 


Ribose 


— — 


— 


+ 


+ 


+ 


Mannose 






+ 


+ 


+ 


Mannitol 






+ 


w 


+ 


Sucrose 


- 


w 


na 


+ 


+ 


D-glucose 






+ 


+ 


+ 


D-fructose 






+ 


+ 


+ 


D-maltose 






+ 


+ 


+ 


D-lactose 






+ 


+ 




Hydrolysis of gelatin 

G+C content (mol%) 


+ + 
36.4 33.5 


42 


na 
36.5 


na 
37.5 


+ 

43.88 


Habitat 


surface of 
human gut used saline 
bottles 


fermented 
seafood 


raw cow milk 
and dairy 
products 


skin and mucosal 
surfaces of most 
healthy dogs 


subterranean 
brine sample 



na = data not available; w = weak 

*Nosocomiicoccus massiliensis strain NP2T, Nosocomiicoccus ampullae strain TRF-1T, Jeotgalicoccus psychrophilus
strain YKJ-115T, Macrococcus caseolyticus strain JCSC5402, Staphylococcus pseudointermedius strain ED 99 and
Salinicoccus albus strain DSM 19776
Figure 4. Reference mass spectrum from N. massiliensis strain NP2T. Spectra from 12 individual colonies
were compared and a reference spectrum was generated.



[Figure 5: gel view. Spectra shown: Staphylococcus pseudintermedius DSM 21284T, Staphylococcus lugdunensis DSM 4805,
Staphylococcus hominis 18 ESL, Staphylococcus caprae DSM 20608T, Nosocomiicoccus massiliensis,
Nosocomiicoccus ampullae 109506T]



Figure 5. Gel view comparing N. massiliensis sp. nov. strain NP2T and other Staphylococcaceae species. The gel view
displays the raw spectra of loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value.
The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity
is expressed by a gray-scale scheme. The color bar and the right y-axis indicate the relation between the color a
peak is displayed with and the peak intensity in arbitrary units. Displayed species are indicated on the left.
Table 3. Project information

MIGS ID      Property                  Term

MIGS-31      Finishing quality         High-quality draft
MIGS-28      Libraries used            One 454 paired end 3-kb library
MIGS-29      Sequencing platforms      454 GS FLX Titanium
MIGS-31.2    Fold coverage             94.97×
MIGS-30      Assemblers                Newbler version 2.5.3
MIGS-32      Gene calling method       Prodigal
             Genbank Bioproject        PRJEB644
             Genbank Date of Release   May 6, 2013
             Gold ID                   Gi22016
MIGS-13      Project relevance         Study of the human gut microbiome



For the shotgun sequencing, DNA (500 ng) was
mechanically fragmented using a Covaris device
(Covaris Inc.) as described by the manufacturer. The
DNA fragmentation was visualized using an Agilent
2100 BioAnalyzer on a DNA Labchip 7500, which
yielded an optimal size of 1.7 kb. The library was
constructed using the GS Rapid library Prep kit
(Roche) and quantified using a TBS 380 mini
fluorometer (Turner Biosystems, Sunnyvale, CA,
USA). The library concentration equivalence was
calculated as 2.8 × 10^9 molecules/µL. The library was
stored at -20°C until further use.

The shotgun library was clonally amplified with 1
and 2 cpb in two emPCR reactions each, and the
paired-end library was amplified with 0.5 cpb in
three emPCR reactions using the GS Titanium SV
emPCR Kit (Lib-L) v2 (Roche). The yields of the
emPCR were 6.8 and 9.8%, respectively, for the
shotgun library, and 11.29% for the paired-end
library. These yields fall into the expected 5 to 20%
range according to the Roche protocol.

For each library, approximately 790,000 beads for a
quarter region were loaded on the GS Titanium
PicoTiterPlate PTP kit and sequenced with the GS
FLX Titanium Sequencing Kit XLR70 (Roche). The
run was performed overnight and analyzed on a
cluster using the gsRunBrowser and Newbler
assembler (Roche). For the shotgun sequencing,
188,659 passed-filter wells were obtained. The
sequencing generated 129.3 Mb with an average length
of 685 bp. For the paired-end sequencing, 106,675
passed-filter wells were obtained. The sequencing
generated 35 Mb with an average length of 262 bp.
The passed-filter sequences were assembled using
Newbler with 90% identity and 40 bp as overlap.
The final assembly identified 12 scaffolds and 154
contigs (> 1,500 bp) and generated a genome size of
1.65 Mb, which corresponds to a coverage of 94.97
genome equivalents.

Genome annotation 

Open Reading Frames (ORFs) were predicted using
Prodigal [38] with default parameters, but the
predicted ORFs were excluded if they spanned a
sequencing gap region. The predicted bacterial
protein sequences were searched against the GenBank
database [39] and the Clusters of Orthologous
Groups (COG) databases using BLASTP. The
tRNAScanSE tool [40] was used to find tRNA genes,
whereas ribosomal RNAs were found by using
RNAmmer [41] and BLASTn against the GenBank
database. Lipoprotein signal peptides and numbers
of transmembrane helices were predicted using
SignalP [42] and TMHMM [43], respectively.
ORFans were identified if their BLASTP E-value
was lower than 1e-03 for alignment length greater
than 80 amino acids. If alignment lengths were
smaller than 80 amino acids, we used an E-value of
1e-05. Such parameter thresholds have already
been used in previous works to define ORFans. To
estimate the mean level of nucleotide sequence
similarity at the genome level between N.
massiliensis and three other members of the family
Staphylococcaceae (Table 6), we used the Average
Genomic Identity of Orthologous gene Sequences
(AGIOS) in-house software. Briefly, this software
uses the Proteinortho software (version
1.4) [44] to detect orthologous proteins between
genomes compared two by two, then retrieves
the corresponding genes and determines
the mean percentage of nucleotide sequence identity
among orthologous ORFs using the Needleman-Wunsch
global alignment algorithm.
Nosocomiicoccus massiliensis strain NP2T was compared
to Macrococcus caseolyticus strain JCSC5402
(GenBank accession number NC_011999), Staphylococcus
pseudointermedius strain ED 99
(NC_017568), and Salinicoccus albus strain DSM
19776 (ARQJ00000000). Artemis [45] was used for
data management and DNA Plotter [46] was used
for visualization of genomic features. The Mauve
alignment tool was used for multiple genomic
sequence alignment and visualization [47].
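The last step of the AGIOS procedure described above, a mean nucleotide identity over orthologous gene pairs aligned with the Needleman-Wunsch algorithm, might be sketched as follows. This is not the AGIOS software itself: the scoring values (match +1, mismatch -1, gap -1), the definition of identity as matches divided by alignment length, and all function names are assumptions made for illustration, and the orthologue detection step (Proteinortho) is taken as already done.

```python
# Minimal sketch, assuming simple linear gap scoring; not the authors' AGIOS code.
from typing import Iterable, Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> Tuple[str, str]:
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to rebuild the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical columns divided by alignment length (an assumed definition)."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        raise ValueError("cannot compare two empty sequences")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    return 100.0 * matches / len(aligned_a)


def mean_orthologue_identity(pairs: Iterable[Tuple[str, str]]) -> float:
    """Mean percent identity over pairs of orthologous gene sequences."""
    values = [percent_identity(a, b) for a, b in pairs]
    if not values:
        raise ValueError("no orthologous pairs supplied")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy sequences for illustration; real input would be orthologous ORFs.
    toy_pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA")]
    print(mean_orthologue_identity(toy_pairs))
```

The quadratic dynamic-programming table is adequate for gene-length sequences; a production tool would more likely call an established alignment library, and the identity values it reports depend on the scoring scheme chosen.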

Genome properties 

The genome of N. massiliensis strain NP2T is
1,645,244 bp long (1 chromosome, but no plasmid)
with a G+C content of 36.40% (Figure 6 and
Table 4). Of the 1,783 predicted genes, 1,738 were
protein-coding genes, and 45 were RNAs. Three
rRNA genes (one 16S rRNA, one 23S rRNA and one
5S rRNA) and 42 predicted tRNA genes were
identified in the genome. A total of 1,350 genes
(75.71%) were assigned a putative function. Two
hundred forty-six genes were identified as ORFans
(13.79%). The remaining genes were annotated as
hypothetical proteins. The properties and the
statistics of the genome are summarized in Table 4.
The distribution of genes into COGs
functional categories is presented in Table 5.
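The percentages quoted above follow directly from the raw counts, as the short sketch below illustrates. The counts are those given in the text and in Table 4; the variable and function names are hypothetical, and the recomputed values can differ from the published ones in the last decimal depending on whether values are rounded or truncated.

```python
# Minimal sketch recomputing genome statistics from the counts in the text.
GENOME_SIZE_BP = 1_645_244
CODING_REGION_BP = 1_479_861
GC_BP = 598_869
TOTAL_GENES = 1_783
PROTEIN_CODING_GENES = 1_738
RNA_GENES = 45
RRNA_GENES = 3
TRNA_GENES = 42


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


def genome_summary() -> dict:
    """Percentages relative to genome size (bp) or to the total gene count."""
    return {
        "coding_region_pct": percent(CODING_REGION_BP, GENOME_SIZE_BP),
        "gc_content_pct": percent(GC_BP, GENOME_SIZE_BP),
        "protein_coding_pct": percent(PROTEIN_CODING_GENES, TOTAL_GENES),
        "rna_genes_pct": percent(RNA_GENES, TOTAL_GENES),
    }


if __name__ == "__main__":
    for name, value in genome_summary().items():
        print(f"{name}: {value:.2f}")
```

A consistency check of this kind also confirms that the gene categories add up: 1,738 protein-coding genes plus 45 RNA genes give the 1,783 predicted genes, and 3 rRNA plus 42 tRNA genes give the 45 RNA genes.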





Figure 6. Graphical circular map of the chromosome. From the outside in, the outer two circles show open
reading frames oriented in the forward and reverse directions, respectively (both colored by COG categories).
The third circle marks the rRNA gene operon (red) and tRNA genes (green). The fourth
circle shows the G+C% content plot. The innermost circle shows GC skew, with purple indicating negative values
and olive indicating positive values.
Table 4. Nucleotide content and gene count levels of the genome

Attribute                           Value        % of total a

Genome size (bp)                    1,645,244
DNA coding region (bp)              1,479,861    89.94
DNA G+C content (bp)                598,869      36.4
Number of replicons                 1
Extrachromosomal elements           0
Total genes                         1,783        100
RNA genes                           45           2.52
rRNA operons                        1
Protein-coding genes                1,738        97.47
Genes with function prediction      1,511        84.74
Genes assigned to COGs              1,350        75.71
Genes with peptide signals          84           4.71
Genes with transmembrane helices    425          23.83
CRISPR repeats                      0

a The total is based on either the size of the genome in base pairs or the total number
of protein coding genes in the annotated genome



Table 5. Number of genes associated with the 25 general COG functional categories

Code    Value    % of total a    Description

J       144      8.29            Translation
A       0        0               RNA processing and modification
K       89       5.12            Transcription
L       111      6.39            Replication, recombination and repair
B       1        0.06            Chromatin structure and dynamics
D       21       1.12            Cell cycle control, mitosis and meiosis
Y       0        0               Nuclear structure
V       36       2.07            Defense mechanisms
T       39       2.24            Signal transduction mechanisms
M       80       4.60            Cell wall/membrane biogenesis
N       3        0.17            Cell motility
Z       0        0               Cytoskeleton
W       0        0               Extracellular structures
U       23       1.32            Intracellular trafficking and secretion
O       59       3.39            Posttranslational modification, protein turnover, chaperones
C       94       5.41            Energy production and conversion
G       65       3.74            Carbohydrate transport and metabolism
E       114      6.56            Amino acid transport and metabolism
F       55       3.16            Nucleotide transport and metabolism
H       73       4.20            Coenzyme transport and metabolism
I       46       2.65            Lipid transport and metabolism
P       108      6.21            Inorganic ion transport and metabolism
Q       28       1.61            Secondary metabolites biosynthesis, transport and catabolism
R       185      10.64           General function prediction only
S       137      7.88            Function unknown
-       388      22.32           Not in COGs

a The total is based on the total number of protein coding genes in the annotated genome.






Genome comparison of Nosocomiicoccus 
massiliensis with Macrococcus caseolyticus, 
Staphylococcus pseudointermedius and 
Salinicoccus albus 

We compared the genome of N. massiliensis strain
NP2T with those of M. caseolyticus strain JCSC5402
(GenBank accession number NC_011999), S.
pseudointermedius strain ED 99 (NC_017568), and
S. albus strain DSM 19776 (ARQJ00000000). The
draft genome of N. massiliensis is smaller in size
than those of M. caseolyticus, S. pseudointermedius
and S. albus (1.6, 2.2, 2.5 and 2.6 Mb, respectively).
The G+C content of N. massiliensis is comparable to
that of M. caseolyticus (36.40 and 36.56%,
respectively) and lower than those of S. pseudointermedius
and S. albus (37.56 and 43.88%, respectively). The
gene content of N. massiliensis is lower than those
of M. caseolyticus, S. pseudointermedius and S. albus
(1,783, 2,113, 2,435 and 2,770, respectively). The
ratio of genes per Mb of N. massiliensis is larger than
those of M. caseolyticus, S. pseudointermedius and S.
albus (1,080, 956, 947 and 1,049, respectively).
However, the distribution of genes into COG
categories was not entirely similar in the four genomes
(Figure 7).
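The gene density comparison above amounts to dividing each gene count by the genome size. The sketch below recomputes it from the figures given in the text; because three of the four genome sizes are quoted only to one decimal place (the N. massiliensis size of 1.645244 Mb is taken from the genome properties), the recomputed densities only approximate the published ratios. The dictionary and function names are hypothetical.

```python
# Minimal sketch: genes per Mb from the gene counts and genome sizes in the text.
GENOMES = {
    # name: (predicted genes, genome size in Mb)
    "N. massiliensis": (1_783, 1.645244),
    "M. caseolyticus": (2_113, 2.2),        # size rounded in the text
    "S. pseudointermedius": (2_435, 2.5),   # size rounded in the text
    "S. albus": (2_770, 2.6),               # size rounded in the text
}


def genes_per_mb(genes: int, size_mb: float) -> float:
    """Gene density expressed as genes per megabase."""
    return genes / size_mb


def gene_densities() -> dict:
    """Gene density for every genome in GENOMES."""
    return {name: genes_per_mb(genes, size)
            for name, (genes, size) in GENOMES.items()}


if __name__ == "__main__":
    for name, density in sorted(gene_densities().items(),
                                key=lambda item: item[1], reverse=True):
        print(f"{name}: {density:.0f} genes/Mb")
```

Even with the rounded sizes, N. massiliensis comes out with the highest gene density of the four genomes, in line with the comparison stated in the text.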

The nucleotide sequence identity ranged from 
64.75 to 69.80% among the genera. Table 6 sum- 
marizes the numbers of orthologous genes and the 
average percentage of nucleotide sequence identity 
between the different genomes studied. 






Figure 7. Distribution of functional classes of predicted genes on Nosocomiicoccus 
massiliensis (colored in blue), Macrococcus caseolyticus (colored in green), Staphylo- 
coccus pseudointermedius (colored in red) and Salinicoccus albus (colored in yellow) 
chromosomes according to the clusters of orthologous groups of proteins. 
Table 6. The numbers of orthologous proteins shared between genomes

                                     Nosocomiicoccus   Macrococcus    Salinicoccus   Staphylococcus
                                     massiliensis      caseolyticus   albus          pseudointermedius

Nosocomiicoccus massiliensis         1,742             995            1,003          954
Macrococcus caseolyticus             67.50             2,216          1,176          1,127
Salinicoccus albus                   66.22             65.46          2,680          1,135
Staphylococcus pseudointermedius     67.48             69.80          64.75          2,351

Upper right triangle: numbers of orthologous proteins shared between genomes
Lower left triangle: average percentage nucleotide similarity of the genes corresponding to orthologous
proteins shared between genomes
Diagonal: numbers of proteins per genome



Conclusion 

On the basis of phenotypic, phylogenetic and
genomic analyses, we formally propose the creation
of Nosocomiicoccus massiliensis sp. nov., which
contains the strain NP2T. This bacterial strain was
isolated from the fecal flora of an AIDS-infected
patient living in Marseille, France. Several
other undescribed bacterial species were also
cultivated from different fecal samples through
diversification of culture conditions [5-22], thus
suggesting that the human fecal flora
remains partially unknown.

Description of Nosocomiicoccus massiliensis sp. nov. 

Nosocomiicoccus massiliensis (mas.si.li.en'sis. L.
masc. adj. massiliensis of Massilia, the Roman
name of Marseille, France, where the type strain
was isolated).

Colonies are 1 mm in diameter on blood-enriched
Columbia agar. Cells are cocci with a mean
diameter of 0.72 µm. Optimal growth is achieved
aerobically and weak growth is observed under
microaerophilic conditions. No growth is observed
under anaerobic conditions. Growth occurs between
25 and 45°C, with optimal growth observed at
37°C. Cells stain Gram-positive, are non-endospore
forming and are motile. Cells are negative
for nitrate reduction, urease, indole production,
glucose fermentation, arginine dihydrolase,
β-galactosidase, glucose, arabinose, mannose,
mannitol, N-acetyl-glucosamine, maltose,
gluconate, caprate, adipate, malate, citrate,
phenyl-acetate and cytochrome oxidase. Cells are
susceptible to amoxicillin, imipenem, rifampicin,
vancomycin, doxycycline and gentamicin but
resistant to trimethoprim/sulfamethoxazole,
metronidazole and ciprofloxacin. The G+C content of
the genome is 36.40%. The 16S rRNA and genome
sequences are deposited in GenBank under accession
numbers JX424771 and CAVG00000000,
respectively.

The type strain NP2 T (= CSUR P246 = DSM 26222) 
was isolated from the fecal flora of an AIDS- 
infected patient living in Marseille, France. 